The subject of this talk -- "New Tests for Little Children" --
is, as you may imagine, a very large subject indeed. Its magnitude
is attested by the fact that the Head Start Test Collection, which
resides in the library at ETS, and is a cooperative venture with OCD,
now contains 908 different tests, with more coming in all the time.
And the Collection does not include an additional 2199 research
instruments that are standing in the wings hoping to qualify for
a regular part in the Collection itself.

Going literally from A to Z, the titles in the Head Start
Test Collection range from the "ABC Inventory," which is a school
readiness measure, to the "Zip Test," which is another readiness
measure for migrant children with Spanish-speaking background. Most
of the tests in the Collection are quite recent, like the "Thomas
Self-Concept Values Test" (1969), but some are old standbys, like
the "Vineland Social Maturity Scale" (1936). The tests cover a lot
of territory in both the cognitive and affective domains. Some of
them are psychometrically respectable; some are trying to become
respectable; and some are innocent of any known psychometric
properties whatever.

Well, having paid my respects to the subject on which I was
expected to talk, I shall now follow the usual practice of changing
the subject to fit my own concerns. What I want to talk about is
"Some Old Problems in New Settings" in the testing of young children.
This is not such a big switch as you might think, because the spate
of new tests, especially for young children, is itself part of two old
problems -- namely, the problem of trying to keep up with all the
new tests that are forever coming on the scene and the much tougher
problem of trying to figure out which ones yield some dependable
information about children and which ones are mostly flights of
fancy, or reincarnations of old and sometimes unsatisfactory gimmicks.
But there are some deeper problems as well that I think need
attending to, and I want to discuss some of them.

I 

For a short spell, about 25 years ago, I taught a course called
"Educational Measurement." It was a required one-semester course
for people who were bucking for a master's degree in education.
Most of the students were already in teaching or administrative
jobs, and a lot of them were working in elementary schools, or
headed in that direction.

The overall objectives of the course, in my mind at least,
were (1) to get the students to acquire a decent respect for data,
(2) to distinguish between dependable data and untrustworthy data,
and (3) to secure some practice in generating good data for their
own classroom use, particularly test data -- and I mean "tests" in a
broad sense to include not just paper and pencil devices, but any
of various ways of observing and sizing up pupil performance.

I learned a lot from teaching that course -- probably more than
the students learned. One of the things I learned was that many of
the teachers -- especially the teachers of young children -- became
traumatized by anything that smacked of statistics -- even such simple
statistics as means, standard deviations, and standard errors of
measurement, which in my view are concepts that should and could
be mastered routinely by every child before the age of 12. I'll
give the kids until age 15 to get a firm grip on things like
correlation and the fundamentals of regression analysis. And I'm
only half joking.

Another thing I learned from the course was that most of those
teachers -- and the administrators too -- tended to be very sloppy
in the terms they used for characterizing pupils and their learning,
for describing explicitly what goes on in the teaching-learning
process, and for specifying the goals they thought they and their
pupils should be reaching for. In short, they seemed to be quite
unacquainted with rigor in thought and talk about educational matters,
and indeed seemed to deplore the exercise of rigorous thinking
altogether, as though it might conceivably interfere with the exercise
of compassion in their dealings with the young. It took me the
whole semester to get some of them to see that the rigors required
by measurement are not incompatible with good teaching -- in fact,
that careful, dispassionate observation of how the children are
getting along is a necessary ingredient of careful, compassionate
teaching.

To my great regret, I do not know how much effect the course
may have had in reducing ignorance about educational measurement
among the students who took the course, nor do I know how much it
may have increased the rigor with which they handled their educational
problems. What I do know is that that kind of course, which was
aimed, I remind you, at the general practitioner, was a rarity then,
and I suspect from what I read and hear and see that it is even
more of a rarity today. I think this is too bad. In fact, I think
it is dangerous, for if there is a flight from the discipline of
measurement in the training of teachers and other educational
personnel, the schools are in for even more trouble than they are
now experiencing. Which is another way of saying that I think a
good part of the cause of what Silberman calls the "Crisis in the
Classroom" is that too many educators have too little respect
for the dignity of data.

II 

Let me get a bit more specific by taking a look at some of
the confused thinking that seems to accompany the testing of children.
As I see it there are two common misconceptions of the basic functions
of tests -- that is, uses to which they get put but which have nothing
to do with their function as measuring instruments. Probably the
most common misconception is the one that conceives of a test as
an incentive to study; that is, as a club to coerce children into
doing what it is supposed they would not otherwise do. Most teachers
have been taught, but too soon forget, that saying to children, "You
must learn this or that because you are going to be tested on it"
undermines the whole educational effort. It is wrong in two ways.

It is wrong educationally because it gets the kids to thinking, from
kindergarten on up, that the main reason for going to school is to
learn to pass tests rather than to learn to make some sense out of
the crazy world they are going to inherit.
It is wrong psychometrically, too, because if a child studies
with the sole purpose of getting a good mark on a test, the chances
are the results will be inflated, or maybe deflated, and therefore
will not tell you what you really need to know to help him grow and
develop. Furthermore, it produces a mind-set in young children
that stays with them the rest of their lives and that makes almost
impossible any effort to get, by means of testing, any valid indication
of how the child feels about himself or his fellow pupils or his
teachers, or of any of those other things we assign to the "affective
domain."

A second common misconception of the function of testing is
that the main purpose of a test is to provide a learning experience --
a series of exercises for the child to perform because they might
help him develop his intellect, or his personality, or his character,
or whatever. The idea is that tests are "good for you," like spinach.

This wrong-headed notion about tests probably does not have such
serious consequences as the wrong-headed notion that tests are to be
equated with prods and clubs. The child is not likely to be seriously
harmed by it. In fact, it can be argued that good tests usually
contain good learning exercises. But to use them as such is to
destroy their use as measurement devices, and I think it is important
for teachers and other school people to come to realize this lest
the measurement function of tests be lost in the shuffle.

For instance, one college teacher, writing on the uses of
course examinations, worries in print that the measurement function
of tests will be overstressed by both pupils and teachers. "As a
result," he says, "it may get in the way of the pedagogic function.
This happens, for instance, when in our concern to set examinations
that can be graded accurately and uniformly, we test only the more
measurable academic capacities." (3)

It seems to me that this misses the main point of testing
altogether. It is saying that accurate measurement of a child's
learning should not get in the way of helping him to learn. It is
saying, "Don't sacrifice good teaching to good testing." But this
line of thinking creates a dilemma where, in my view, there ought
never to be a dilemma. For good testing, that is, accurate measurement
of pupil performance, is, I submit, absolutely indispensable
to good teaching. You never know whether your teaching is any good,
or more importantly, how to make it better, unless you have some
reasonably dependable way of observing what effects it is having
on the child's learning -- and by "learning" I don't mean just
academic learning in the conventional sense, but learning to cope
with the whole broad array of developmental tasks that have been
described by scholars like Erik Erikson and Robert Havighurst.

The primary and most important product of any test is the kind of
information that tells you how better to help the child learn. If,
as a byproduct, the tasks in a test happen to provide useful learning
exercises for the child, so much the better. But if the quality
of the test as a measuring instrument is neglected, then the main
point of the testing is lost.



III

Having disposed, I hope, of the two principal misconceptions
of the functions of testing as clubs and character builders, let's
look at some more appropriate conceptions of what tests are supposed
to do, and the problems even these can generate. There are three
functions that stand out: tests for selection, tests for pupil
guidance, and tests for evaluating instruction.

Tests for selection have a long history. Such tests, like
paper, printing, and gunpowder, were invented by the Chinese some
4000 years ago. As early as 2200 B.C. the emperor of China was
using achievement tests to decide which of his officials should be
promoted and which should be fired. Later on they were real
performance tests, too, covering the five basic arts: music, archery,
horsemanship, writing, and arithmetic, which I suppose made them
reasonably appropriate "criterion-referenced measures" of performance
in government office. In any case, the use of achievement tests
or aptitude tests (which in my view are one and the same thing) for
selecting people and sorting them out has persisted down through
the ages and is unquestionably their most prevalent use throughout
American society today -- even, of course, right down to the preschool
and kindergarten level, where "readiness tests" have become so
predominant in delaying the entrance of children to kindergarten
and the first grade.

This pervasive use of tests for selection is why tests in 
general are so frequently perceived as a personal threat, why 
cramming for tests is so widely tolerated if not condoned, why 
cheating on tests is so common, why minority groups complain, not 
without good reason, that tests tend to be biased against them and 
keep them from making it into the mainstream. 

But let's face it. This widespread use of tests for purposes of
selection, for deciding from kindergarten on up who will pass and
who will fail, who will be winners and who will be losers, is not
likely to go away in a hurry. For, whether we like it or not, it
has become indigenous to the kind of competitive culture that
characterizes all of our social institutions, including our educational
institutions.

So, it seems to me, we are faced with an extraordinarily difficult
problem. If we propose to use tests primarily for evaluating and
improving instructional processes, rather than for slapping down
children and teachers, how are we going to convince the victims that
we really mean what we say? Somehow we have got to separate the
functions of testing in fact as well as in the minds of the children
and the teachers affected. Otherwise the use of tests for selection
purposes will tend to destroy their usefulness for guidance and for
providing the feedback both the pupils and the teachers must have
to make things go better in the classroom, or in the "open corridor,"
or wherever. For if children are learning from the very start to
beat the tests or are learning to be turned off by tests -- as I'm
convinced many children in the ghetto are -- the kinds of information
the tests yield are not likely to be very helpful in pointing the
way to teaching that is best adapted to the individual needs of
children.

I have no easy answers to this problem, but I think it is one
that should be squarely confronted and placed high on the agenda for
training teachers and school administrators, and, I hasten to add,
test specialists and psychometricians as well, since, as I read them,
most of the developments of psychometric theory tend to center almost
exclusively around the selection model.

Although the task of getting out of this bind will be difficult,
it can be simply stated: it is one of getting past the usual rhetoric
that is spouted in the name of educational evaluation and getting
down to the hard business of finding practical and believable ways
of dissociating the selective function of testing from its diagnostic
or feedback function -- and it is here that I think the so-called
criterion-referenced tests, once we understand better what we mean
by them, can play a part. At present the theory on which they rest
is still somewhat confused and filled with controversy.

IV 

Now let’s go back in history and look at another aspect of the 
problem of testing little children. 

In the 67 years since Binet and Simon produced their first 
intelligence scale, we have learned a great deal that can and ought 
to be put to practical use in the testing of young children, provided 
educators can become sufficiently cautious in exploiting their
possibilities. 

It should be remembered, first of all, that tests in the Binet
tradition have from the very beginning been primarily concerned with
the testing of young children. The Binet-Simon scales of 1905 and
1908 were aimed at children ranging in age from 3 to 11. In 1912
Kuhlmann's revision extended the scale down to the age of three
months; Terman's Stanford revision of 1916 started with exercises
for two-year-olds; and most recently, the Wechsler Preschool and
Primary Scale of 1967 focuses on children from four to six-and-a-half.
The point I am trying to make is that, taken all together, the tests
in the Binet tradition represent a vast amount of good empirical
work in cooking up a great variety of developmental tasks for observing
the mental functioning of children at early ages. And tests in the
factor-analytic tradition, culminating in the work of people like
Guilford, have enriched the item bank still further.

It is useful to recall that Binet's original genius consisted
chiefly in two fundamental contributions to the measurement of
children's cognitive functioning: (1) the invention and development
of a set of psychologically complex exercises that differentiated
children who were doing well in school from those who were doing
not so well and (2) the invention of a normative scale, the mental
age scale, for ordering the children in accordance with their
performance on a variety of these tasks.

The notion of a normative scale in its time was a real breakthrough.
The trouble is that, as it has become more and more widely
used in the schools, its basic nature and its limitations have not
been well understood by the people who use it. In my experience,
most teachers and other school people can't seem to get it through
their heads that the use of years and months to measure a child's
cognitive behavior -- or for that matter any kind of behavior --
is a fundamentally different operation from the use of feet and
inches to measure a child's height. Which is to say that they have
a hard time getting the idea that the units in a normative scale
are not additive, that you cannot in logic say, for instance, that
the cognitive functioning of a child with a mental age of eight
is twice that of a child with a mental age of four. And of course
when you move over to normative scales based on grades-in-school,
you are in even worse trouble because grades-in-school are simply
arbitrary inventions for organizing schools and slicing up curriculums
in any of a thousand different ways. People don't seem to realize
that adding or subtracting or averaging grade-equivalent scores is
only a little less absurd than it would be to add or subtract or
average a set of social security numbers.
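
The point about non-additive units can be made concrete with a small sketch in Python. Everything in it is an assumption introduced for illustration: the norms table is invented, and straight-line interpolation between grade medians is only one way such scores might be derived. It shows that the same raw-score gain is worth different numbers of "grades" at different places on the scale, and that the average of two grade equivalents is not the grade equivalent of the average performance.

```python
# Hypothetical norms table (invented for illustration): the median raw score
# earned by pupils at each grade placement on some imaginary test.
NORMS = [(1.0, 10), (2.0, 25), (3.0, 33), (4.0, 37), (5.0, 39)]


def grade_equivalent(raw):
    """Linearly interpolate a raw score into the hypothetical norms table."""
    if raw <= NORMS[0][1]:
        return NORMS[0][0]
    for (g_lo, r_lo), (g_hi, r_hi) in zip(NORMS, NORMS[1:]):
        if raw <= r_hi:
            return g_lo + (g_hi - g_lo) * (raw - r_lo) / (r_hi - r_lo)
    return NORMS[-1][0]


# The same 8-point raw gain, taken low and high on the scale.
gain_low = grade_equivalent(25) - grade_equivalent(17)
gain_high = grade_equivalent(39) - grade_equivalent(31)

# Averaging grade equivalents versus converting the average raw score.
raws = [10, 39]
mean_of_ge = sum(grade_equivalent(r) for r in raws) / len(raws)
ge_of_mean = grade_equivalent(sum(raws) / len(raws))

print(gain_low, gain_high, mean_of_ge, ge_of_mean)
```

With these made-up norms, an 8-point raw gain comes out as roughly half a grade at the low end and two and a quarter grades at the high end, and the two pupils "average" grade 3.0 even though their average raw score sits just below grade 2.0.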

It was to Binet's credit that having devised the mental age
scale, he stopped there. He did not invent that rich source of
useless psychological controversy, the notorious IQ, which has
distracted too many good minds from more fruitful educational
endeavors and still does. It seems to me that the whole nature-nurture
controversy, centering around the IQ, has become a disaster
from the standpoint of trying to find better ways of teaching the
young. Its widespread use in the schools has encouraged and
reinforced the attitude among teachers and others that once you
have figured a child's IQ at, say, the age of five, and have put
him in the "right slot," the whole academic establishment can rest
on its oars and let the inevitable stream of academic routine do
its work. The IQ tends to give teachers the notion that they can
hide behind the numbers and thereby be relieved of any pressing
obligation to keep studying their pupils to try to determine the
manifold ways in which their behavior is changing and developing
as a consequence of their school experiences.

The reversal of this attitude has been slow in coming, which
is to say that it has taken a terribly long time for Jean Piaget's
ideas to begin to percolate into the thinking of American education
to the point where teachers may, hopefully, become perceptive
child-watchers and not just textbook watchers or, worse still, IQ
watchers. Piaget's lectures on the language and thought of the child,
which were delivered in 1923, represented a radical departure from
the Binet tradition in the testing of children. Piaget began his
investigations, you remember, by simply watching young children and
observing the patterns of behavior that emerged as they interacted
with the phenomena of their world. He was concerned with getting
behind the numbers to try to infer what was actually going on in
the minds of the children. This was another kind of breakthrough.

In his book on The Psychology of Intelligence, Piaget himself has
summed up the contrast between his approach and the Binet approach
in two cogent sentences:



     It is indisputable, [he says] that these tests of mental
     age have on the whole lived up to what was expected of
     them: a rapid and convenient estimation of an individual's
     general level. But it is less obvious that they simply
     measure a "yield" without reaching constructive operations
     themselves. (8)

It is of course these "constructive operations" -- the different
ways children perceive and think about and interact with their world --
that must be observed and understood as precisely as possible if
teaching is to meet the child where he is and shape the school
experience so as to maximize his development. But it is not until
quite recently that any systematic attempt has been made to incorporate
Piaget's ideas into testing procedures that could be used by teachers
to enhance children's learning. As far as I know, the first such
attempt was the project that ETS undertook in some of the
elementary schools in New York City back around 1964. It carried
the title Let's Look at First Graders, and its purpose,
essentially, was to help first grade teachers observe and assess the
specific aspects of the intellectual development of each pupil as
accurately and objectively as possible so that they could teach
them better.

But this sort of thing has been an uphill struggle, and as far
as I can see, it has not yet made much of a dent. I remember talking
a few years ago to a group of teacher-trainers about this project
and whipping up quite a lot of enthusiasm about it. After I got
through, one of the people there said, "You know, this is great
stuff, and I've been trying myself to get my own student-teachers
on top of it. But they just can't seem to grasp the Piagetian
concepts or to apply them in their own classwork."

In short, it appears terribly difficult to make most school
people believe that intelligence has many facets which can be
observed and assessed if one pays close enough attention to the
behavior of the pupils. The reason for this hang-up is that too
few people realize that intelligence can be taught and that the
primary business of schools and school teachers is to teach it --
and not just hide behind the numbers generated by paper and pencil
tests concocted by somebody else. Such numbers, if properly used,
can be helpful, but I submit that the heart of the process of
educational measurement ought to be thought of as the disciplined
observation by teachers of the behavior of their pupils, and
until we can get this idea across, most of the teaching, not to
mention the testing, of little children will continue to be
intolerably blind in its operations and dubious in its effects.

I think we can do better. I think it is up to you to see
that we do.